Abstract


Motivated  by  the  problem  of  minefield  detection,  we  investigate  the  problem  of  classifying 
mixtures of spatial point processes. In particular, we are interested in testing the hypothesis
that a given dataset was generated by a Poisson process versus a mixture of a Poisson
process  and  a  hard-core  Strauss  process.  We  propose  testing  this  hypothesis  by  comparing 
the  evidence  for  each  model  by  using  partial  Bayes  factors.  We  use  the  term  partial  Bayes 
factor  to  describe  a  Bayes  factor,  a  ratio  of  integrated  likelihoods,  based  on  only  part  of  the 
available  information,  namely  that  information  contained  in  a  small  number  of  functionals  of 
the data. We applied our method to both real and simulated data, and considering the difficulty
of classifying these point patterns by eye, our approach overall produced good results.


KEY  WORDS:  Bayes  Factors,  Strauss  Process. 
1 Introduction


We  investigate  the  problem  of  comparing  competing  models  for  spatial  point  process  data. 
In  particular  we  are  interested  in  testing  the  hypothesis  that  the  data  were  generated  by  a 
Poisson process (i.e., complete spatial randomness) versus a mixture of a Poisson process and
an  inhibited  process.  The  motivation  behind  this  methodology  is  the  problem  of  minefield 
detection.  An  aerial  view  of  a  possible  minefield  has  been  imaged.  This  image  is  processed 
into  a  set  of  object  locations.  Each  object  is  either  a  mine  or  can  be  considered  to  be  clutter 
or  noise.  The  mines  are  assumed  to  be  laid  out  in  such  a  way  that  two  mines  are  unlikely 
to  be  close  together. A  hard-core  Strauss  process  is  one  way  to  model  this  inhibition.  The 
noise  points  are  assumed  to  be  located  randomly  throughout  the  study  region.  The  inherent 
difficulty  of  this  problem  can  be  seen  in  Figures  1(a),  1(b),  and  1(c).  This  is  a  problem  where 
the  human  eye  offers  few  visual  cues,  yet  statistical  techniques  can  produce  surprisingly  good 
results. 

The  problem  of  comparing  a  simple  model  for  inhibition  or  clustering  versus  complete 
spatial  randomness  was  considered  by  Diggle  (1983)  and  Cressie  (1993).  The  problem  of 
classifying  mixtures  of  spatial  point  processes  was  tackled  by  Raghavan,  Goel,  and  Ghosh 
(1997, 1998). Their approach was to develop a supervised pattern recognition scheme using
functionals based on nearest neighbor distances, second order statistics and spatial tessellations.
We propose comparing the evidence for each model directly by using partial Bayes
factors. We use the term partial Bayes factor to describe a Bayes factor, a ratio of integrated
likelihoods, based on only part of the available information, namely that information
contained in a small number of functionals of the data.

In  the  following  sections  we  describe  the  spatial  point  process  models  we  use  and  formulate 
the  minefield  problem  as  a  hypothesis  testing  problem.  We  briefly  review  Bayes  factors  and 
define  partial  Bayes  factors  in  Section  4.  In  Section  5  we  discuss  possible  summary  statistics 
one  could  use.  Results  of  applying  our  method  to  simulated  and  real  data  are  presented  in 
Section  6  and  are  discussed  in  Section  7. 

2  Point  Process  Models 

First  we  give  some  notation.  To  avoid  possible  ambiguities  associated  with  the  word  “point”, 
we  shall  refer  to  locations  of  objects  as  events,  and  let  the  word  point  refer  to  any  location 
in  the  sample  space.  The  sample  space  or  study  region  will  be  denoted  by  A,  and  |A|  will 
denote the area of this region. We will consider each event to be one of two types, “noise
events” and “mines”. Let N be the total number of events, $n_0$ be the number of noise events,
and m be the number of mines. Let $d_{ij}$ be the distance between the $i$th and $j$th events, and
let $d_i = \min_{j \ne i} d_{ij}$. We shall condition on the number of events, N, and the study region, A,
throughout. Let $Y = (y_1, \ldots, y_N)$ be a random vector taking values in $A^N$ that represents
the locations of all events in A.


2.1  Noise  process 

As was mentioned in the introduction, the noise events are considered to be scattered randomly
throughout A. Under the hypothesis that no minefield is present, and given that we
are conditioning on the number of events in A, the distribution of Y is uniform over $A^N$, i.e.:

$$P_u(Y) = \frac{1}{|A|^N}.$$

We call this a uniform process, and we denote it by $Y \sim \mathrm{Uniform}(N, A)$.


2.2  Minefield  process 

The  mines  are  assumed  to  be  spread  evenly  over  A.  This  implies  that  the  minefield  process 
displays  inhibition.  A  simple  model  for  an  inhibited  process  is  the  Strauss  process  (Strauss 
1975; Kelly and Ripley 1976). The likelihood for the Strauss process is:

$$P_s(Y \mid \theta) = \frac{1}{C} \prod_{i<j} g(d_{ij}),$$

where $g(\cdot)$ is the interaction function, given by:

$$g(d) = \begin{cases} \gamma, & 0 < d \le \rho, \text{ where } \gamma \in [0, 1], \\ 1, & d > \rho, \end{cases}$$

and where the parameters of the Strauss process are denoted by $\theta = \{\rho, \gamma\}$. The extent of
the interactions between two events is controlled by $\rho$, while the nature of these interactions
is determined by $\gamma$. If $\gamma = 0$ the process is known as a hard-core process. In this process,
two events are forbidden to be within distance $\rho$ of each other. Alternatively, if $\gamma = 1$, the
process is simply a uniform process on A. Values of $\gamma$ between 0 and 1 discourage but do
not forbid events to be within distance $\rho$ of each other. Note that the normalizing constant,
C, of the Strauss process can be difficult to calculate, especially for processes demonstrating
strong inhibition (see Diggle, Fiksel, Grabarnik, Ogata, Stoyan, and Tanemura 1994).
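
Simulating the hard-core case ($\gamma = 0$) does not require the normalizing constant. The following is a minimal sketch, assuming the unit square as study region and a fixed number of events m. It builds a valid starting pattern by sequential placement and then applies Metropolis moves that relocate one event at a time, rejecting any move that would put two events within distance $\rho$. The function names, the number of sweeps, and the choice of sampler are illustrative assumptions, not the simulation method of the paper.

```python
import math
import random


def too_close(p, others, rho):
    """True if point p lies within distance rho of any point in others."""
    return any(math.dist(p, q) <= rho for q in others)


def initial_hard_core(m, rho, rng, max_tries=100_000):
    """Build a valid starting pattern by sequential placement on the unit square."""
    pts = []
    tries = 0
    while len(pts) < m:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("rho too large for m events; no valid start found")
        cand = (rng.random(), rng.random())
        if not too_close(cand, pts, rho):
            pts.append(cand)
    return pts


def sample_hard_core(m, rho, sweeps=200, seed=0):
    """Metropolis sampler for m events with all pairwise distances > rho.

    Each step proposes moving one randomly chosen event to a uniform
    location and accepts only if the hard-core constraint still holds,
    which leaves the uniform distribution over valid patterns invariant.
    """
    rng = random.Random(seed)
    pts = initial_hard_core(m, rho, rng)
    for _ in range(sweeps * m):
        i = rng.randrange(m)
        cand = (rng.random(), rng.random())
        rest = pts[:i] + pts[i + 1:]
        if not too_close(cand, rest, rho):
            pts[i] = cand
    return pts


pattern = sample_hard_core(m=20, rho=0.08)
min_dist = min(math.dist(p, q) for k, p in enumerate(pattern) for q in pattern[k + 1:])
print(len(pattern), min_dist > 0.08)
```

Edge effects are ignored here, and convergence of the chain is not checked; both would need attention in a real study.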
2.3 Mixture process


Consider a superposition of a Strauss process upon a uniform process. Let $Y = Y_u \cup Y_s$,
where $Y_u$ are the events generated by the uniform process and $Y_s$ are the events generated by
the Strauss process. Let Z be a variable indicating to which group each observation belongs,
i.e.

$$Z_i = \begin{cases} 0, & \text{if } y_i \in Y_u, \\ 1, & \text{if } y_i \in Y_s, \end{cases}$$

for $i = 1, \ldots, N$. Note that $\sum_{i=1}^{N} Z_i = m$, the number of Strauss events (mines). If Z is
known, then the likelihood for the mixture process can be written as:

$$P_m(Y \mid Z, \theta) = P_u(Y_u \mid Z, \theta) \times P_s(Y_s \mid Z, \theta).$$

If Z is unknown (as would be the case in practice) then we must sum over all the values of
Z, multiplied by their respective probabilities, i.e.

$$P_m(Y \mid \theta) = \sum_{Z \in \mathcal{Z}} P_m(Y \mid Z, \theta)\, \pi(Z \mid \theta)
= \sum_{m=0}^{N} \; \sum_{Z \in \mathcal{Z}:\, \sum_i Z_i = m} P_m(Y \mid Z, \theta)\, \pi(Z \mid m, \theta)\, \pi(m \mid \theta).$$

Given  the  problem  of  obtaining  the  normalizing  constant  for  a  Strauss  process,  this  sum  is 
extremely  difficult  to  compute. 

3 Formulation of the Minefield Problem as a Hypothesis Testing Problem

We  cast  the  minefield  problem  in  terms  of  two  competing  hypotheses.  Here  we  model  the 
minefield  process  as  a  hard-core  Strauss  model.  Thus,  for  a  given  point  pattern  Y,  the 
competing  hypotheses  of  interest  are: 

$H_0$: No minefield present
$Y \sim \mathrm{Uniform}(N, A)$;

$H_1$: Minefield present
$Y = Y_u \cup Y_s$, where
$Y_u \sim \mathrm{Uniform}(n_0, A)$,
$Y_s \sim \mathrm{Strauss}(m, A, \gamma = 0)$,
and $m + n_0 = N$.

3.1  Prior  Specification 

Under $H_1$, there are two unknown model parameters, $\rho$ and m. In a Bayesian framework,
one must specify a prior distribution $\pi(\rho, m)$ on $\rho$ and m. The prior could be decomposed
in the following ways:

1. $\pi(\rho, m) = \pi(\rho) \times \pi(m)$ (assuming $\rho$ and m are independent)

2. $\pi(\rho, m) = \pi(\rho \mid m) \times \pi(m)$

3. $\pi(\rho, m) = \pi(m \mid \rho) \times \pi(\rho)$.

If  good  prior  information  about  the  number  of  mines  and  inhibition  distance  is  available, 
then  the  independence  assumption  of  the  first  prior  may  be  reasonable.  However,  given  that 
we are conditioning on the study region, A, and the total number of events, there exists a
constraint on the maximum separation between two events and the total number of mines
in A. Diggle (1983) noted that the maximum proportion of a finite region, A, that can be
covered by non-overlapping discs, of radius $\rho$, is achieved when the discs are packed in an
equilateral triangular lattice. This suggests that the maximum value of $\rho$, given that there
are m points in A (ignoring edge effects), is:

$$\rho_{\max} = \sqrt{\frac{2|A|}{m\sqrt{3}}}.$$

This bound will be useful in setting a prior for $\rho$ conditional on m. For instance, if one has
only a vague idea of the number of mines, but knows that they are closely packed together,
then one could use priors of the following form:

$$\pi(m) = \mathrm{Discrete\ Uniform}(m_1, m_2)$$

$$\pi(\rho \mid m) = \mathrm{Uniform}(\alpha_1 \rho_{\max}, \alpha_2 \rho_{\max}), \quad \text{where } 0 < \alpha_1 < \alpha_2 < 1.$$

This  is  the  form  of  the  prior  distributions  we  used  in  our  simulation  study  in  Section  6. 
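
Drawing from a prior of this form is straightforward. The sketch below assumes that the triangular-lattice bound takes the form $\rho_{\max} = \sqrt{2|A| / (m\sqrt{3})}$, which follows if events on a triangular lattice with spacing $\rho$ occur at a density of $2/(\sqrt{3}\rho^2)$ per unit area. The function names and the example parameter values are hypothetical.

```python
import math
import random


def rho_max(m, area=1.0):
    """Assumed bound: lattice spacing at which m events fill a region of the
    given area on an equilateral triangular lattice (edge effects ignored)."""
    return math.sqrt(2.0 * area / (math.sqrt(3.0) * m))


def sample_prior(m1, m2, alpha1, alpha2, area=1.0, rng=random):
    """Draw (m, rho) with m ~ DiscreteUniform(m1, m2) and
    rho | m ~ Uniform(alpha1 * rho_max, alpha2 * rho_max)."""
    m = rng.randint(m1, m2)          # randint includes both endpoints
    top = rho_max(m, area)
    rho = rng.uniform(alpha1 * top, alpha2 * top)
    return m, rho


rng = random.Random(1)
m, rho = sample_prior(m1=30, m2=70, alpha1=0.4, alpha2=0.6, rng=rng)
print(m, round(rho / rho_max(m), 3))
```

Because the range of $\rho$ is tied to m, the sampled pair always respects the packing constraint, which an independent prior on $\rho$ and m would not guarantee.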
4 Partial Bayes Factors


In this section we briefly introduce Bayes factors and define what we mean by partial Bayes
factors. Consider data Y that are assumed to have arisen under one of the two competing
hypotheses, $H_0$ or $H_1$. Let $\theta_i$ be a $d_i$-dimensional vector of parameters associated with
hypothesis $H_i$ ($i = 0, 1$), and let $\pi_i(\theta_i \mid H_i)$ denote its prior distribution. Let the
probability density of Y given the value of $\theta_i$, i.e. the likelihood function, be denoted by
$P(Y \mid \theta_i, H_i)$. The Bayes factor for $H_1$ against $H_0$ is the ratio of the posterior to the prior
odds for $H_1$ against $H_0$, namely:

$$B_{10} = \frac{P(H_1 \mid Y) / P(H_1)}{P(H_0 \mid Y) / P(H_0)}
= \frac{P(Y \mid H_1)}{P(Y \mid H_0)}
= \frac{\int P(Y \mid \theta_1, H_1)\, \pi_1(\theta_1 \mid H_1)\, d\theta_1}{\int P(Y \mid \theta_0, H_0)\, \pi_0(\theta_0 \mid H_0)\, d\theta_0}.$$
In  other  words  the  Bayes  factor  is  the  ratio  of  integrated  likelihoods.  The  Bayes  factor 
provides  evidence  for  one  hypothesis  over  another.  Kass  and  Raftery  (1995)  review  the 
history,  development,  and  use  of  Bayes  factors.  A  guide  for  interpreting  Bayes  factors, 
proposed  by  Kass  and  Raftery  (based  on  Jeffreys  1961),  is  given  in  Table  1. 


Table 1: Guide for interpreting Bayes factors.

$2\log_e(B_{10})$    $B_{10}$      Evidence for $H_1$
0 to 2               1 to 3        Weak
2 to 5               3 to 12       Positive
5 to 10              12 to 150     Strong
> 10                 > 150         Decisive
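
The scale in Table 1 can be applied mechanically. The sketch below is an illustration only: it maps a Bayes factor to its verbal category on the $2\log_e(B_{10})$ scale, and it assumes that negative values are read symmetrically as evidence for $H_0$. The function name is hypothetical.

```python
import math


def evidence_category(bayes_factor):
    """Map a Bayes factor B10 to the verbal scale of Table 1 via 2*ln(B10)."""
    score = 2.0 * math.log(bayes_factor)
    favoured = "H1" if score > 0 else "H0"
    size = abs(score)
    if size <= 2:
        label = "Weak"
    elif size <= 5:
        label = "Positive"
    elif size <= 10:
        label = "Strong"
    else:
        label = "Decisive"
    return score, label, favoured


print(evidence_category(1.7))
```

For example, a Bayes factor of 1.7 gives $2\log_e(1.7) \approx 1.06$, which falls in the weak category in favor of $H_1$.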

In the mixture models we consider, it is possible to simulate realizations from each model,
but it is difficult to write down the likelihood explicitly since the normalizing constant and
the group memberships, Z, are unknown. Instead, we use the partial Bayes factor, defined as
the ratio of integrated likelihoods for a summary statistic, X (or a vector of several summary
statistics, X), rather than for the complete data. This can be written as:

$$\frac{P(X \mid H_1)}{P(X \mid H_0)}
= \frac{\int P(X \mid \theta_1, H_1)\, \pi_1(\theta_1 \mid H_1)\, d\theta_1}{\int P(X \mid \theta_0, H_0)\, \pi_0(\theta_0 \mid H_0)\, d\theta_0}
= \frac{I_1(X)}{I_0(X)}.$$


We can calculate these integrated likelihoods by quadrature methods or by Monte Carlo
integration. If $\theta_i^{(1)}, \ldots, \theta_i^{(K)}$ is a random sample of size K from the prior under
hypothesis i, and $\hat{P}(X \mid \theta_i^{(j)}, H_i)$ is an estimate of $P(X \mid \theta_i^{(j)}, H_i)$, then the Monte Carlo
estimate of $I_i$ is:

$$\hat{I}_i(X) = \frac{1}{K} \sum_{j=1}^{K} \hat{P}(X \mid \theta_i^{(j)}, H_i).$$

To obtain the estimated density function $\hat{P}(X \mid \theta_i^{(j)}, H_i)$, we simulate 100 point patterns
from $H_i$ with parameters $\theta_i^{(j)}$, and calculate their summary statistics. Let these 100 summary
statistics be denoted by $X_i^{(j)}$. A standard density estimation procedure, such as kernel
density estimation (Silverman 1986), is then applied to $X_i^{(j)}$ to obtain $\hat{P}(X \mid \theta_i^{(j)}, H_i)$.

Obviously  the  selection  of  X  is  important.  We  discuss  choices  of  X  below.  Note  that 
nowhere  do  we  assume  that  X  is  univariate.  A  bivariate  or  higher  dimensional  statistic 
may  give  better  discrimination  between  the  hypotheses.  However,  this  may  lead  to  excessive 
computation  as  density  estimation  in  more  than  one  dimension  can  be  difficult. 
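
The Monte Carlo scheme described above can be sketched for a univariate statistic as follows. This is an illustration under stated assumptions, not the authors' code: the kernel is Gaussian with the normal-reference bandwidth $1.06\,s\,n^{-1/5}$, and the two "models" in the usage example are toy Gaussian stand-ins rather than point process simulators. All function names are hypothetical.

```python
import math
import random
import statistics


def kde_at(x, sample):
    """Gaussian kernel density estimate at x, normal-reference bandwidth."""
    n = len(sample)
    h = 1.06 * statistics.stdev(sample) * n ** (-0.2)
    z = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)
    return z / (n * h * math.sqrt(2.0 * math.pi))


def integrated_likelihood(x_obs, draw_prior, simulate_stat, K=50, n_sim=100, rng=random):
    """Monte Carlo estimate of I(x): average over K prior draws of a
    kernel estimate of P(x | theta) built from n_sim simulated statistics."""
    total = 0.0
    for _ in range(K):
        theta = draw_prior(rng)
        sims = [simulate_stat(theta, rng) for _ in range(n_sim)]
        total += kde_at(x_obs, sims)
    return total / K


# Toy stand-ins (NOT the point process models): the statistic is Gaussian,
# centred at 0 under H0 and at theta ~ Uniform(1, 3) under H1.
rng = random.Random(0)
stat = lambda theta, r: r.gauss(theta, 1.0)
i0 = integrated_likelihood(2.0, lambda r: 0.0, stat, rng=rng)
i1 = integrated_likelihood(2.0, lambda r: r.uniform(1.0, 3.0), stat, rng=rng)
print("2 ln(I1/I0) =", 2.0 * math.log(i1 / i0))
```

In the minefield setting, `simulate_stat` would generate a point pattern under the given parameters and return its summary statistic; the cost is then dominated by the K × n_sim pattern simulations.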


5  Summary  Statistics 

The  types  of  summary  statistics  considered  by  Raghavan,  Goel  and  Ghosh  (hereafter  RGG) 
for their supervised pattern recognition scheme fell into three main categories: nearest neighbor
distances, second order statistics, and spatial tessellations. We describe each category
below. 
5.1 Nearest Neighbor Distances


The  empirical  cumulative  distribution  function  (CDF)  of  the  nearest  neighbor  distances 
between  all  events  is  given  by: 

$$\hat{G}_N(d) = \frac{1}{N} \sum_{i=1}^{N} 1\{d_i \le d\}, \quad d > 0.$$

This function can highlight differences in small-scale interactions between different point
process models. A similar function to $G(d)$ is $F(d)$, the empty space function. This is the
CDF of the distance of an arbitrary fixed point in A to the nearest point of the spatial point
pattern. RGG recorded the minimum, the mean, the coefficient of variation, skewness and
kurtosis of $\hat{G}(\cdot)$ and $\hat{F}(\cdot)$, as well as the ratio $\hat{G}(\cdot)/\hat{F}(\cdot)$.
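
As an illustration, the nearest neighbor distances $d_i$ and their empirical CDF can be computed directly from the event coordinates. The sketch below uses a brute-force search and applies no edge correction; the function names and the four-event example are hypothetical.

```python
import math


def nearest_neighbor_distances(events):
    """d_i = min over j != i of the distance between events i and j."""
    return [
        min(math.dist(p, q) for j, q in enumerate(events) if j != i)
        for i, p in enumerate(events)
    ]


def g_hat(events, d):
    """Empirical CDF of the nearest neighbor distances, evaluated at d."""
    dists = nearest_neighbor_distances(events)
    return sum(1 for di in dists if di <= d) / len(dists)


# Two tight pairs: nearest neighbor distances are about 0.1, 0.1, 0.3, 0.3.
events = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.0, 0.7)]
print(g_hat(events, 0.05), g_hat(events, 0.2), g_hat(events, 0.5))
```

Summary statistics such as the minimum, mean, or coefficient of variation can then be taken over the list returned by `nearest_neighbor_distances`.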

5.2  Second  Order  Statistics 

The  K-function,  (Bartlett  1964;  Ripley  1976,  1977;  Cressie  1993,  Ch.  8)  has  been  used 
extensively  as  an  exploratory  tool  for  the  analysis  of  point  patterns,  in  particular  their 
second order statistics. For a spatial point process of intensity $\lambda$, it is defined as:

$$K(d) = \lambda^{-1} E(\text{number of events within distance } d \text{ of an arbitrary event}).$$

An estimator that corrects for edge effects was given by Ripley (1976):

$$\hat{K}(d) = \frac{|A|}{N^2} \sum_{i=1}^{N} \sum_{j \ne i} w_{ij}^{-1}\, 1\{d_{ij} \le d\}, \quad d > 0,$$

where $w_{ij}$ is the proportion of the circumference of the circle centered at event i and passing
through event j that lies inside the study region A. The intensity of the spatial point process,
$\lambda$, is estimated by $\hat{\lambda} = N/|A|$.
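
A simplified version of this estimator is easy to compute. The sketch below sets every weight $w_{ij}$ to 1, so it applies no edge correction, and it assumes the normalization $|A|/N^2$; it is meant only to show the counting involved. The function names and the lattice example are hypothetical.

```python
import math


def k_hat(events, d, area=1.0):
    """Estimate K(d) with every edge weight w_ij set to 1 (no edge correction)."""
    n = len(events)
    pairs = sum(
        1
        for i, p in enumerate(events)
        for j, q in enumerate(events)
        if i != j and math.dist(p, q) <= d
    )
    return area * pairs / n ** 2


def l_minus_d(events, d, area=1.0):
    """sqrt(K_hat(d) / pi) - d, which is close to zero for a uniform pattern."""
    return math.sqrt(k_hat(events, d, area) / math.pi) - d


# A 5 x 5 square lattice with spacing 0.2: no pairs closer than the spacing.
lattice = [(0.1 + 0.2 * i, 0.1 + 0.2 * j) for i in range(5) for j in range(5)]
print(k_hat(lattice, 0.15), k_hat(lattice, 0.25))
```

On the lattice, the estimate is zero below the spacing and jumps once the 80 ordered nearest-neighbor pairs are counted, which is the kind of signature an inhibited pattern produces.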

If the underlying process over a region with area |A| is uniform, then the number of
events within a ball of radius d around a given event, assuming the ball is contained in A,
is binomial with mean $np = N\pi d^2/|A|$. Thus, $K(d) = \pi d^2$, and $\sqrt{K(d)/\pi}$ versus d is a line
of slope one through the origin. RGG proposed the following two statistics based on the
K-function:
1. The difference between the area under $\sqrt{\hat{K}(d)/\pi}$ and the 45° line over the initial part
(from $\min_i(d_i)$ to $\max_i(d_i)$) of the curve, i.e.

$$\int_{\min_i(d_i)}^{\max_i(d_i)} \left( \sqrt{\hat{K}(u)/\pi} - u \right) du.$$

2. The slope of $\sqrt{\hat{K}(d)/\pi}$ from $\min_i(d_i)$ to $\max_i(d_i)$.


We propose another statistic based on the K-function. Under strict inhibition, Isham
(1984) showed that in the plane the K-function for the Strauss process with $\gamma = 0$ is approximately:

$$K(d) \approx \begin{cases} 0, & 0 < d \le \rho, \\ \pi (d^2 - \rho^2), & d > \rho. \end{cases}$$


For this process, clearly there is a change in the K-function at the point $\rho$ which defines
the inhibition process. Even for the K-function of a mixture process, we expect a change in
the behavior of the estimated K-function, since it is a mixture of the inhibited K-function
and the uniform K-function. We can estimate $\rho$ by:

$$\hat{\rho} = \arg\min_d \left( \sqrt{\hat{K}(d)/\pi} - d \right),$$


which  we  use  as  a  summary  statistic. 
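
A grid search is one simple way to evaluate this estimate. The sketch below is an illustration only: it uses a K-function estimate without edge correction (all weights equal to 1), and the candidate distances are supplied by the caller. The function name and the lattice example are hypothetical.

```python
import math


def rho_hat(events, grid_d, area=1.0):
    """Return the d in grid_d (increasing) minimising sqrt(K_hat(d)/pi) - d.

    K_hat is the estimate with unit edge weights (no edge correction).
    """
    n = len(events)
    dists = sorted(
        math.dist(p, q) for i, p in enumerate(events) for q in events[i + 1:]
    )
    best_d, best_val = None, math.inf
    k = 0  # number of unordered pairs with distance <= d
    for d in grid_d:
        while k < len(dists) and dists[k] <= d:
            k += 1
        # 2 * k ordered pairs enter the K-function estimate
        val = math.sqrt(area * 2 * k / (n ** 2 * math.pi)) - d
        if val < best_val:
            best_d, best_val = d, val
    return best_d


# A 5 x 5 square lattice with spacing 0.2 behaves like a strongly inhibited pattern.
lattice = [(0.1 + 0.2 * i, 0.1 + 0.2 * j) for i in range(5) for j in range(5)]
grid_d = [0.005 + 0.01 * s for s in range(40)]
print(rho_hat(lattice, grid_d))
```

For the lattice, the minimum falls at the last grid value below the spacing of 0.2, since the curve decreases with slope −1 until the first pairs are counted.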

Figure  2  shows  the  K-functions  for  the  spatial  point  patterns  of  Figure  1.  The  differences 
between  the  plots  are  apparent:  the  K-functions  of  the  Strauss  process  and  the  mixture 
process both have sharp drops at $d = \rho$, while the K-function of the Poisson process is
stationary  with  mean  zero. 


5.3  Spatial  Tessellations 

RGG  also  investigated  using  spatial  tessellations  (e.g.  see  Okabe,  Boots,  and  Sugihara  1992) 
to  distinguish  between  point  process  models.  The  simplest  spatial  tessellation  is  the  Voronoi 
tessellation.  Here  every  point  in  A  is  associated  with  the  nearest  event  in  A.  This  results  in 
the study region, A, being partitioned into polygonal tiles (or Voronoi cells) (see Figure 3).
RGG  found  the  second  central  moment  of  the  areas  of  the  Voronoi  cells  to  be  a  good  summary 
statistic. 
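
The second central moment of the tile areas can be approximated without constructing the tessellation explicitly. The sketch below is a raster approximation on the unit square, an assumption made to keep the example self-contained: each raster cell is assigned to its nearest event, and the area of a Voronoi tile is approximated by the share of cells it receives. A computational geometry library would normally be used instead. The function name is hypothetical.

```python
import math
import statistics


def voronoi_area_variance(events, grid_size=100):
    """Approximate second central moment of Voronoi cell areas on the unit square."""
    counts = [0] * len(events)
    for a in range(grid_size):
        for b in range(grid_size):
            # centre of raster cell (a, b)
            x, y = (a + 0.5) / grid_size, (b + 0.5) / grid_size
            nearest = min(range(len(events)), key=lambda i: math.dist((x, y), events[i]))
            counts[nearest] += 1
    areas = [c / grid_size ** 2 for c in counts]
    return statistics.pvariance(areas)  # population variance = second central moment


regular = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
uneven = [(0.2, 0.5), (0.9, 0.5)]
print(voronoi_area_variance(regular), voronoi_area_variance(uneven))
```

A regular arrangement gives equal tiles and hence a variance of zero, which is why a small value of this statistic points towards inhibition.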
6 Simulation Study & Data Analysis


We  performed  a  simulation  study  to  assess  the  performance  of  partial  Bayes  factors  in  the 
minefield problem. The simulation study is a simple $2^2$ factorial design. The two factors
we considered were: the number of noise events, $n_0$, and the amount of prior information.
Various  other  factors  could  have  been  considered,  including  the  number  of  mines,  and  the 
inhibition  distance.  The  parameters  used  in  the  simulation  study  are  given  in  Table  2. 

Table 2: Parameters of simulation study.

Variable    Value
A           $(0, 1)^2$
m           50
$n_0$       30, 50
$\rho$      $\frac{1}{2}\rho_{\max}$

We shall refer to the different noise levels as being high ($n_0 = 50$) or low ($n_0 = 30$).
Typical  realizations  of  each  of  these  spatial  point  processes  are  shown  in  Figure  4.  As  one 
can  see,  neither  of  these  point  patterns  is  easily  distinguished  by  eye  from  a  realization  of  a 
uniform  process. 

6.1  Priors 

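We decomposed the prior distribution on $\rho$ and m in the following way:

$$\pi(\rho, m) = \pi(\rho \mid m) \times \pi(m)$$

$$\pi(\rho \mid m) = \mathrm{Uniform}(\alpha_1 \rho_{\max}, \alpha_2 \rho_{\max})$$

$$\pi(m) = \mathrm{Discrete\ Uniform}(\lfloor \beta_1 N \rfloor, \lfloor \beta_2 N \rfloor),$$

where $\lfloor \cdot \rfloor$ is the floor function.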

We selected three sets of values for $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$ that would correspond to “diffuse”,
“compact”, and “tight” priors. These are shown in Table 3. We also created a prior corresponding
to “perfect” prior information, i.e. a prior with point mass on $m = 50$, $\rho = \frac{1}{2}\rho_{\max}$.
Figure 4: Typical minefields of simulation study. (a) Low noise. (b) High noise.


Table 3: Parameters of prior distributions.

Prior      $\alpha_1$   $\alpha_2$   $\beta_1$   $\beta_2$
Diffuse    0.3          0.7          0.10        0.90
Compact    0.4          0.6          0.30        0.70
Tight      0.45         0.55         0.40        0.60

6.2  Summary  Statistics 

Two different summary statistics were considered to calculate the partial Bayes factors. The
first summary statistic was based on the K-function, and the second (due to RGG) on the
Voronoi tessellation. We shall refer to them as $X_K$ and $X_V$, respectively:

$X_K = \hat{\rho} = \arg\min_d \left( \sqrt{\hat{K}(d)/\pi} - d \right)$

$X_V$ = Second central moment of the areas of the Voronoi cells
6.3 Edge Effects


Both  of  the  above  statistics  can  suffer  from  edge  effects.  Edge  effects  occur  because  events 
near  the  boundary  have  fewer  neighbors  than  events  in  the  central  part  of  the  study  area. 
We  accounted  for  edge  effects  by  generating  all  point  patterns  on  a  region  with  a  20%  border. 
Thus,  instead  of  generating  a  point  pattern  with  N  events  on  the  unit  square,  we  generated 
$\lfloor 1.96 \times N \rfloor$ events on $(-0.2, 1.2)^2$. The factor 1.96 is the ratio of the areas of the two regions.
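
The arithmetic behind the enlarged region can be checked with a short sketch. It uses exact fractions so that the floor is not affected by floating-point rounding, and it generates a uniform pattern on the enlarged square as an example. The function names are hypothetical, and how the events in the border are used when computing the statistics is not specified here.

```python
from fractions import Fraction
import math
import random


def bordered_design(n_events, border="0.2"):
    """Enlarged square and event count for a border around the unit square."""
    b = Fraction(border)
    side = 1 + 2 * b                      # side length of the enlarged square
    ratio = side ** 2                     # area ratio, 1.96 for a 20% border
    n_big = math.floor(ratio * n_events)  # exact arithmetic, then floor
    return (float(-b), float(1 + b)), float(ratio), n_big


def uniform_pattern(n_events, border="0.2", seed=0):
    """Uniform events on the enlarged square."""
    (lo, hi), _, n_big = bordered_design(n_events, border)
    rng = random.Random(seed)
    return [(rng.uniform(lo, hi), rng.uniform(lo, hi)) for _ in range(n_big)]


print(bordered_design(100))   # ((-0.2, 1.2), 1.96, 196)
print(len(uniform_pattern(80)))
```

With N = 100 events on the unit square, the enlarged region receives 196 events; with N = 80, it receives 156.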

6.4  Results 

We  simulated  100  point  patterns  on  the  unit  square  (accounting  for  edge  effects  as  described 
above) under each hypothesis, and for each value of $n_0$ (i.e. a total of 400 datasets). We
calculated the partial Bayes factors for each dataset using both summary statistics, $X_K$ and
$X_V$, and the prior distributions given in Table 3. The partial Bayes factors are calculated
in terms of evidence for $H_1$ over $H_0$. The integration was performed using simple numerical
quadrature. The misclassification rates are given in Table 4. From this table we can see
that the total misclassification rate (assuming that each hypothesis is equally likely a priori)
was 22.5% for the partial Bayes factor based on the K-function, and 25.5% for the partial
Bayes factor based on the Voronoi tiling. While these total error rates are similar, the partial
Bayes factor based on $X_K$ was more successful at correctly classifying minefields than noise
processes. The opposite was true of the partial Bayes factor based on $X_V$. As one would
expect,  an  increase  in  the  amount  of  (correct)  information  contained  in  priors  improves  the 
discrimination  in  both  cases. 

The  histograms  of  these  partial  Bayes  factors  are  shown  in  Figures  5,  6,  7,  and  8.  These 
plots are summarized in Tables 5, 6, 7, and 8. The results for each statistic may be summarized
as follows:

$X_K$:


• Under $H_1$, the partial Bayes factors typically provide weak to positive evidence for the
(correct) minefield hypothesis.

• Under $H_0$, the partial Bayes factors typically range from positive evidence for $H_0$ to
weak evidence for $H_1$. The median partial Bayes factor is approximately 1.7 in favor
of the correct hypothesis.

• The increase in noise had a small negative effect on the performance of the partial
Bayes factors under both hypotheses.
Table 5: Breakdown of partial Bayes factors for $H_1$, based on $X_K$, under $H_1$.

                 % Evidence for $H_0$                    % Evidence for $H_1$
Prior    Noise   Decisive   Strong    Positive  Weak     Weak    Positive  Strong   Decisive
         Level   (-∞,-10]   (-10,-5]  (-5,-2]   (-2,0]   (0,2]   (2,5]     (5,10]   (10,∞)
Diffuse  Low     0          0         1         3        74      22        0        0
Compact  Low     0          0         1         5        15      79        0        0
Tight    Low     0          0         1         9        17      73        0        0
Perfect  Low     0          1         0         10       16      47        26       0
Diffuse  High    0          0         8         5        57      30        0        0
Compact  High    0          0         8         15       15      62        0        0
Tight    High    0          0         9         19       13      59        0        0
Perfect  High    0          0         6         22       22      50        0        0
Table 6: Breakdown of partial Bayes factors for $H_1$, based on $X_K$, under $H_0$.

                 % Evidence for $H_0$                    % Evidence for $H_1$
Prior    Noise   Decisive   Strong    Positive  Weak     Weak    Positive  Strong   Decisive
         Level   (-∞,-10]   (-10,-5]  (-5,-2]   (-2,0]   (0,2]   (2,5]     (5,10]   (10,∞)
Diffuse  Low     0          0         30        30       31      9         0        0
Compact  Low     0          0         29        39       16      16        0        0
Tight    Low     0          1         36        32       16      15        0        0
Perfect  Low     3          24        30        21       9       11        2        0
Diffuse  High    0          0         40        21       34      5         0        0
Compact  High    0          0         41        31       16      12        0        0
Tight    High    0          0         45        30       14      11        0        0
Perfect  High    2          9         28        36       19      6         0        0
Table 7: Breakdown of partial Bayes factors for $H_1$, based on $X_V$, under $H_1$.

                 % Evidence for $H_0$                    % Evidence for $H_1$
Prior    Noise   Decisive   Strong    Positive  Weak     Weak    Positive  Strong   Decisive
         Level   (-∞,-10]   (-10,-5]  (-5,-2]   (-2,0]   (0,2]   (2,5]     (5,10]   (10,∞)
Diffuse  Low     0          0         5         35       22      24        13       1
Compact  Low     0          0         7         26       24      29        14       0
Tight    Low     0          0         7         26       21      33        13       0
Perfect  Low     0          4         8         22       19      28        19       0
Diffuse  High    0          0         5         42       27      21        5        0
Compact  High    0          0         5         30       33      27        5        0
Tight    High    0          0         6         28       31      30        5        0
Perfect  High    0          0         6         27       29      33        5        0
Table 8: Breakdown of partial Bayes factors for $H_1$, based on $X_V$, under $H_0$.

                 % Evidence for $H_0$                    % Evidence for $H_1$
Prior    Noise   Decisive   Strong    Positive  Weak     Weak    Positive  Strong   Decisive
         Level   (-∞,-10]   (-10,-5]  (-5,-2]   (-2,0]   (0,2]   (2,5]     (5,10]   (10,∞)
Diffuse  Low     0          0         54        36       9       1         0        0
Compact  Low     1          4         55        27       11      2         0        0
Tight    Low     6          6         47        27       9       5         0        0
Perfect  Low     5          26        42        15       7       5         0        0
Diffuse  High    0          0         47        39       12      2         0        0
Compact  High    0          0         47        37       12      4         0        0
Tight    High    8          2         38        32       14      6         0        0
Perfect  High    1          11        37        28       17      6         0        0
Table 4: Simulation study: Percentage of misclassifications.

                         K-function          Voronoi
Prior      Noise Level   $H_1$    $H_0$      $H_1$    $H_0$
Diffuse    Low           4        40         40       10
Compact    Low           6        32         33       13
Tight      Low           10       31         33       14
Perfect    Low           11       22         34       12
Diffuse    High          13       39         47       14
Compact    High          23       28         35       16
Tight      High          28       25         34       20
Perfect    High          28       25         33       23
Average Error Rate       15       30         36       15
Total Error Rate         22.5                25.5

$X_V$:

• Under $H_1$, the partial Bayes factors typically range from positive evidence for $H_1$ to
weak evidence for $H_0$. The median partial Bayes factor is again approximately 1.7 in
favor of the correct hypothesis.

• Under $H_0$, the evidence for $H_0$ typically ranges from weak to positive.

• There is a negligible negative effect on the performance of the partial Bayes factors due
to the increase in noise.

6.5  Minefield  Data 

Figure  9  shows  the  locations  of  mines  (o)  and  noise  events  (+)  on  a  surf  beach.  The  dataset, 
which  we  shall  refer  to  as  the  Surf  Zone  dataset,  was  described  and  analyzed  in  Lake  and 
Keenan  (1995),  Lake,  Sadler,  and  Casey  (1997)  and  Walsh  and  Raftery  (2002).  One  should 
note that the mines are approximately evenly spaced along parallel rows. Since the
rows are further apart than consecutive mines within a row, the mines do not resemble a
typical Strauss process. However, since the minefield does display inhibition, the Strauss
model is a useful first approximation to the minefield process.

In  order  to  account  for  edge  effects  we  analyzed  the  points  lying  in  the  central  square 
region shown in Figure 9. This region contains 40 events: 20 mines and 20 noise events. The
intermine distance is approximately half of $\rho_{\max}$, so we used the same priors to analyze this
dataset as were used in the simulation study.

Figure 9: Surf Zone Data.

The partial Bayes factors calculated based on $X_K$ and $X_V$ for each of the three priors
are shown in Table 9. We can see that the partial Bayes factors based on $X_K$ provide weak
evidence for the minefield hypothesis (borderline positive under the tight prior). Since the
mines are not actually laid out as a Strauss process, this result is reasonably good.

Table 9: Twice the log partial Bayes factors for $H_1$, for the Minefield Dataset.

Prior      K-function   Voronoi
Diffuse    1.64         -1.47
Compact    1.87         -1.26
Tight      2.02         -1.31

However, the partial Bayes factors based on $X_V$ provide weak evidence against the minefield
hypothesis. It is apparent from these results that the K-function was better than the
Voronoi tiling variance at capturing the regularities in this dataset.
7 Discussion


In  this  paper  we  investigated  the  feasibility  of  using  partial  Bayes  factors  to  classify  mixtures 
of  spatial  point  processes.  We  limited  our  attention  to  two  different  summary  statistics  on 
the  basis  of  which  to  calculate  the  partial  Bayes  factors.  One  summary  statistic  was  based  on 
the  K-function,  and  the  other  on  the  Voronoi  tessellation.  We  performed  a  simulation  study, 
and found that partial Bayes factors based on both statistics provided good discrimination
between the competing hypotheses we considered (total misclassification rates of 22.5% and
25.5%, respectively). We also applied our method to real
minefield data and found that the statistic based on the K-function was more successful at
detecting  the  minefield  in  our  dataset. 

A  previous  approach  to  this  problem  using  a  supervised  pattern  recognition  scheme  based 
on  summary  statistics  of  the  point  pattern  was  developed  by  Raghavan,  Goel,  and  Ghosh 
(1997,  1998).  Our  approach  has  the  advantage  of  providing  a  natural  framework  within 
which  to  include  prior  information  about  each  competing  hypothesis,  which  can  be  very 
useful  in  this  kind  of  application  when  it  is  available. 

Our  approach  is  motivated  by  the  problem  of  having  a  statistical  model  from  which  we 
can  simulate  data,  but  which  has  a  likelihood  that  is  difficult  to  evaluate.  The  task  of 
parameter  estimation  in  this  setting  was  investigated  by  Diggle  and  Gratton  (1984).  Their 
approach  was  to  use  simulated  realizations  from  an  ‘implicit’  statistical  model,  and  kernel 
estimation,  to  estimate  the  log-likelihood  function  and  then  to  maximize  this  function  via  a 
modified  simplex  algorithm. 

More  recently,  Harshman  and  Clark  (1998)  used  a  simulation  based  maximum  likelihood 
method  for  estimation  of  parameters  in  a  sperm  competition  model.  As  in  our  method,  they 
reduced  the  data  to  an  approximately  sufficient  summary  statistic.  In  this  paper  we  have 
limited  our  attention  to  the  problem  of  classifying  spatial  point  processes.  However,  the 
partial  Bayes  factor  methodology  is  clearly  applicable  in  other  situations. 